NSF PAR Search | NSF Public Access Repository

Machine learning evaluation in the Global Event Processor FPGA for the ATLAS trigger upgrade

https://doi.org/10.1088/1748-0221/19/05/P05031

Jiang, Zhixing; Carlson, Ben; Deiana, Allison; Eastlack, Jeff; Hauck, Scott; Hsu, Shih-Chieh; Narayan, Rohin; Parajuli, Santosh; Yin, Dennis; Zuo, Bowen (May 2024, Journal of Instrumentation)

Abstract The Global Event Processor (GEP) FPGA is an area-constrained, performance-critical element of the Large Hadron Collider's (LHC) ATLAS experiment. It needs to very quickly determine which small fraction of detected events should be retained for further processing, and which other events will be discarded. This system involves a large number of individual processing tasks, brought together within the overall Algorithm Processing Platform (APP), to make filtering decisions at an overall latency of no more than 8 ms. Currently, such filtering tasks are hand-coded implementations of standard deterministic signal processing tasks. In this paper we present methods to automatically create machine learning based algorithms for use within the APP framework, and demonstrate several successful such deployments. We leverage existing machine learning to FPGA flows such as hls4ml and fwX to significantly reduce the complexity of algorithm design. These have resulted in implementations of various machine learning algorithms with latencies of 1.2 μs and less than 5% resource utilization on a Xilinx XCVU9P FPGA. Finally, we implement these algorithms into the GEP system and present their actual performance. Our work shows the potential of using machine learning in the GEP for high-energy physics applications. This can significantly improve the performance of the trigger system and enable the ATLAS experiment to collect more data and make more discoveries. The architecture and approach presented in this paper can also be applied to other applications that require real-time processing of large volumes of data.
Low Latency Edge Classification GNN for Particle Trajectory Tracking on FPGAs

https://doi.org/10.1109/FPL60245.2023.00050

Huang, Shi-Yu; Yang, Yun-Chen; Su, Yu-Ru; Lai, Bo-Cheng; Duarte, Javier; Hauck, Scott; Hsu, Shih-Chieh; Hu, Jin-Xuan; Neubauer, Mark S. (September 2023, IEEE)

In-time particle trajectory reconstruction in the Large Hadron Collider is challenging due to the high collision rate and numerous particle hits. Using GNN (Graph Neural Network) on FPGA has enabled superior accuracy with flexible trajectory classification. However, existing GNN architectures have inefficient resource usage and insufficient parallelism for edge classification. This paper introduces a resource-efficient GNN architecture on FPGAs for low latency particle tracking. The modular architecture facilitates design scalability to support large graphs. Leveraging the geometric properties of hit detectors further reduces graph complexity and resource usage. Our results on Xilinx UltraScale+ VU9P demonstrate 1625x and 1574x performance improvement over CPU and GPU respectively.
Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml

https://doi.org/10.1088/2632-2153/acc0d7

Khoda, Elham E; Rankin, Dylan; Teixeira de Lima, Rafael; Harris, Philip; Hauck, Scott; Hsu, Shih-Chieh; Kagan, Michael; Loncar, Vladimir; Paikara, Chaitanya; Rao, Richa; et al (April 2023, Machine Learning: Science and Technology)

Abstract Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers—long short-term memory and gated recurrent unit—within the hls4ml framework. We demonstrate that our implementation is capable of producing effective designs for both small and large models, and can be customized to meet specific design requirements for inference latencies and FPGA resources. We show the performance and synthesized designs for multiple neural networks, many of which are trained specifically for jet identification tasks at the CERN Large Hadron Collider.
Graph Neural Networks for Charged Particle Tracking on FPGAs

https://doi.org/10.3389/fdata.2022.828666

Elabd, Abdelrahman; Razavimaleki, Vesal; Huang, Shi-Yu; Duarte, Javier; Atkinson, Markus; DeZoort, Gage; Elmer, Peter; Hauck, Scott; Hu, Jin-Xuan; Hsu, Shih-Chieh; et al (March 2022, Frontiers in Big Data)

The determination of charged particle trajectories in collisions at the CERN Large Hadron Collider (LHC) is an important but challenging problem, especially in the high interaction density conditions expected during the future high-luminosity phase of the LHC (HL-LHC). Graph neural networks (GNNs) are a type of geometric deep learning algorithm that has successfully been applied to this task by embedding tracker data as a graph (nodes represent hits, while edges represent possible track segments) and classifying the edges as true or fake track segments. However, their study in hardware- or software-based trigger applications has been limited due to their large computational cost. In this paper, we introduce an automated translation workflow, integrated into a broader tool called hls4ml, for converting GNNs into firmware for field-programmable gate arrays (FPGAs). We use this translation tool to implement GNNs for charged particle tracking, trained using the TrackML challenge dataset, on FPGAs with designs targeting different graph sizes, task complexities, and latency/throughput requirements. This work could enable the inclusion of charged particle tracking GNNs at the trigger level for HL-LHC experiments.
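The graph encoding described in this abstract can be illustrated with a small sketch. It is not the authors' code or the hls4ml workflow: the toy hits, the rule of linking hits only on adjacent detector layers, and the helper names `build_edges` and `label_edges` are all assumptions chosen for illustration. It shows only how hits become nodes, candidate segments become edges, and each edge receives a true/fake label.

```python
from itertools import product

# Hypothetical toy hits: (hit_id, detector_layer, particle_id).
hits = [
    (0, 0, "a"), (1, 0, "b"),
    (2, 1, "a"), (3, 1, "b"),
    (4, 2, "a"), (5, 2, "b"),
]

def build_edges(hits):
    """Link every pair of hits on adjacent layers as a candidate segment."""
    by_layer = {}
    for hit_id, layer, _ in hits:
        by_layer.setdefault(layer, []).append(hit_id)
    edges = []
    for layer in sorted(by_layer):
        # Assumption: only layer -> layer + 1 connections are considered.
        for src, dst in product(by_layer[layer], by_layer.get(layer + 1, [])):
            edges.append((src, dst))
    return edges

def label_edges(hits, edges):
    """Truth label per edge: 1 if both hits share a particle, else 0."""
    particle = {hit_id: pid for hit_id, _, pid in hits}
    return [int(particle[src] == particle[dst]) for src, dst in edges]

edges = build_edges(hits)
labels = label_edges(hits, edges)
```

In a trained edge classifier, the labels produced here would serve as training targets, and the network would predict a score per edge from hit features instead of reading the particle identity directly.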
QONNX: Representing Arbitrary-Precision Quantized Neural Networks

Pappalardo, Alessandro; Umuroglu, Yaman; Blott, Michaela; Mitrevski, Jovan; Hawks, Ben; Tran, Nhan; Loncar, Vladimir; Summers, Sioni; Borras, Hendrik; Muhizi, Jules; et al (January 2022, Fermi National Accelerator Lab)

GPU coprocessors as a service for deep learning inference in high energy physics

https://doi.org/10.1088/2632-2153/abec21

Krupa, Jeffrey; Lin, Kelvin; Acosta Flechas, Maria; Dinsmore, Jack; Duarte, Javier; Harris, Philip; Hauck, Scott; Holzman, Burt; Hsu, Shih-Chieh; Klijnsma, Thomas; et al (April 2021, Machine Learning: Science and Technology)
FPGAs-as-a-Service Toolkit (FaaST)

Rankin, Dylan; Krupa, Jeffrey; Harris, Philip; Flechas, Maria; Holzman, Burt; Klijnsma, Thomas; Pedro, Kevin; Tran, Nhan; Hauck, Scott; Hsu, Shih-Chieh; et al (October 2020, ArXivorg)
Computing needs for high energy physics are already intensive and are expected to increase drastically in the coming years. In this context, heterogeneous computing, specifically as-a-service computing, has the potential for significant gains over traditional computing models. Although previous studies and packages in the field of heterogeneous computing have focused on GPUs as accelerators, FPGAs are an extremely promising option as well. A series of workflows are developed to establish the performance capabilities of FPGAs as a service. Multiple different devices and a range of algorithms for use in high energy physics are studied. For a small, dense network, the throughput can be improved by an order of magnitude with respect to GPUs as a service. For large convolutional networks, the throughput is found to be comparable to GPUs as a service. This work represents the first open-source FPGAs-as-a-service toolkit.
FPGAs-as-a-Service Toolkit (FaaST)

https://doi.org/10.1109/H2RC51942.2020.00010

Rankin, Dylan; Krupa, Jeffrey; Harris, Philip; Flechas, Maria Acosta; Holzman, Burt; Klijnsma, Thomas; Pedro, Kevin; Tran, Nhan; Hauck, Scott; Hsu, Shih-Chieh; et al (November 2020, 2020 IEEE/ACM International Workshop on Heterogeneous High-performance Reconfigurable Computing (H2RC))
Computing needs for high energy physics are already intensive and are expected to increase drastically in the coming years. In this context, heterogeneous computing, specifically as-a-service computing, has the potential for significant gains over traditional computing models. Although previous studies and packages in the field of heterogeneous computing have focused on GPUs as accelerators, FPGAs are an extremely promising option as well. A series of workflows are developed to establish the performance capabilities of FPGAs as a service. Multiple different devices and a range of algorithms for use in high energy physics are studied. For a small, dense network, the throughput can be improved by an order of magnitude with respect to GPUs as a service. For large convolutional networks, the throughput is found to be comparable to GPUs as a service. This work represents the first open-source FPGAs-as-a-service toolkit.
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices

Fahim, Farah; Hawks, Benjamin; Herwig, Christian; Hirschauer, James; Jindariani, Serge; Nhan, Trần; Carloni, Luca; DiGuglielmo, Giuseppe; Harris, Phillip; Krupa, Jeffrey; et al (April 2021, ArXivorg)
Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends including an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery.
Applications and Techniques for Fast Machine Learning in Science

https://doi.org/10.3389/fdata.2022.787421

Deiana, Allison McCarn; Tran, Nhan; Agar, Joshua; Blott, Michaela; Di Guglielmo, Giuseppe; Duarte, Javier; Harris, Philip; Hauck, Scott; Liu, Mia; Neubauer, Mark S.; et al (April 2022, Frontiers in Big Data)

In this community review report, we discuss applications and techniques for fast machine learning (ML) in science—the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.